Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

Authors

Abstract

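This paper studies bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards, and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret O(log T) can be forced to suffer regret O(T) with an expected amount of contamination O(log T). This amount of contamination is also necessary, as we prove that there exists an O(log T) regret bandit algorithm, specifically the classical UCB, that requires Omega(log T) contamination to suffer regret Omega(T). To combat such poisoning attacks, our second main contribution is to propose verification based mechanisms, which use limited verification to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we show that with O(log T) expected number of verifications, a simple modified version of the Explore-then-Commit type bandit algorithm can restore the order optimal O(log T) regret irrespective of the amount of contamination used by the attacker. We also provide a UCB-like verification scheme, called Secure-UCB, that also enjoys full recovery from any attacks, also with O(log T) expected number of verifications. To derive a matching lower bound on the number of verifications, we prove that for any order-optimal bandit algorithm, this number of verifications is necessary to recover the order-optimal regret. On the other hand, when the number of verifications is bounded above by a budget B, we propose a novel algorithm, Secure-BARBAR, which provably achieves O(min(C,T/sqrt(B))) regret with high probability against weak attackers (i.e., attackers who have to place the contamination before seeing the actual pulls of the bandit algorithm), where C is the total amount of contamination by the attacker, which breaks the known Omega(C) lower bound of the non-verified setting if C is large.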
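The idea of an Explore-then-Commit learner that relies on verified rewards can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the function names, the Bernoulli reward model, the exploration length of 8 log T pulls per arm, and the `suppress_best_arm` attacker are all assumptions made for illustration. It only shows why estimating arm means from uncontaminated rewards during a logarithmic exploration phase protects the commit decision.

```python
import math
import random


def explore_then_commit(means, horizon, attack, verify, seed=0):
    """Toy Explore-then-Commit on Bernoulli arms (illustrative assumption).

    means:   true arm means, unknown to the learner
    attack:  hypothetical attacker, maps (arm, true_reward) to additive noise
    verify:  if True, every exploration reward is verified (uncontaminated)
    Returns (committed_arm, number_of_verifications).
    """
    rng = random.Random(seed)
    k = len(means)
    # Assumed exploration length of order log T per arm; the constant 8 is arbitrary.
    pulls_per_arm = min(math.ceil(8 * math.log(horizon)), horizon // k)
    reward_sums = [0.0] * k
    verifications = 0
    for arm in range(k):
        for _ in range(pulls_per_arm):
            true_reward = 1.0 if rng.random() < means[arm] else 0.0
            observed = true_reward + attack(arm, true_reward)
            if verify:
                # Verification reveals the uncontaminated reward.
                reward_sums[arm] += true_reward
                verifications += 1
            else:
                reward_sums[arm] += observed
    # Commit: the remaining horizon - k * pulls_per_arm rounds play this arm.
    committed = max(range(k), key=lambda a: reward_sums[a])
    return committed, verifications


def suppress_best_arm(arm, reward):
    """Hypothetical attacker that subtracts 1 from every reward of arm 0."""
    return -1.0 if arm == 0 else 0.0


if __name__ == "__main__":
    for verify in (False, True):
        arm, used = explore_then_commit([0.9, 0.5], 10_000, suppress_best_arm, verify)
        print(f"verify={verify}: committed arm {arm}, verifications {used}")
```

In this toy setting the unverified learner commits to the worse arm because arm 0 looks unprofitable after contamination, while the verified learner spends a number of verifications that grows only logarithmically in the horizon and commits to the better arm.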

Similar resources

Data Poisoning Attacks against Autoregressive Models

Forecasting models play a key role in money-making ventures in many different markets. Such models are often trained on data from various sources, some of which may be untrustworthy. An actor in a given market may be incentivised to drive predictions in a certain direction to their own benefit. Prior analyses of intelligent adversaries in a machine-learning context have focused on regression an...

Certified Defenses for Data Poisoning Attacks

Machine learning systems trained on user-provided data are susceptible to data poisoning attacks, whereby malicious users inject false training data with the aim of corrupting the learned model. While recent work has proposed a number of attacks and defenses, little is understood about the worst-case loss of a defense in the face of a determined attacker. We address this by constructing approxi...

Generating random media from limited microstructural information via stochastic optimization

Random media abound in nature and in manmade situations. Examples include porous media, biological materials, and composite materials. A stochastic optimization technique that we have recently developed to reconstruct realizations of random media (given limited microstructural information in the form of correlation functions) is investigated further, critically assessed, and refined. The recons...

Stochastic Rank-1 Bandits

We propose stochastic rank-1 bandits, a class of online learning problems where at each step a learning agent chooses a pair of row and column arms, and receives the product of their values as a reward. The main challenge of the problem is that the individual values of the row and column are unobserved. We assume that these values are stochastic and drawn independently. We propose a computation...

Sparse Stochastic Bandits

In the classical multi-armed bandit problem, d arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales linearly with d (or with √d in the minimax sense). We here consider the sparse case of this classical problem in the sense that only a small number of arms, n...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i7.20777